First a few packages we will need

knitr::opts_chunk$set(echo = TRUE)
require(raster)
require(sf)
require(ggplot2)
require(sdmpredictors)
require(terra)
require(geodata)
setwd(dirname(rstudioapi::getActiveDocumentContext()$path))
world <- map_data("world")

What is a Raster?

Raster graphics vs Raster layers

You may have heard of Raster graphics before, and you’ve definitely seen them, which is a good place to start.

This image is a raster. It is made up of pixels that have a defined value (color) and point in space (xy).

Rasters contain one or more variables with associated spatial information, basically xyz coordinates in a single object. A good example of this is geo-referenced environmental data, let’s whip up an example from Bio-ORACLE 2. These are not to be confused with vectors, which are collections of points, lines, or polygons that share a data type.

require(sdmpredictors)
pred_datasets <- list_datasets(terrestrial = TRUE, marine = TRUE)
pred_datasets
##   dataset_code terrestrial marine                              url
## 1    WorldClim        TRUE  FALSE        http://www.worldclim.org/
## 2   Bio-ORACLE       FALSE   TRUE           hhtp://bio-oracle.org/
## 3      MARSPEC       FALSE   TRUE              http://marspec.org/
## 4      ENVIREM        TRUE  FALSE       https://envirem.github.io/
## 5   Freshwater        TRUE  FALSE https://www.earthenv.org/streams
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                description
## 1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   WorldClim is a set of global climate layers (climate grids). Note that all data has been transformed back to real values, so there is no need to e.g. divide temperature layers by 10.
## 2                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                Bio-ORACLE is a set of GIS rasters providing geophysical, biotic and environmental data for surface and benthic marine realms at a spatial resolution 5 arcmin (9.2 km) in the ESRI ascii and tif format.
## 3 MARSPEC is a set of high resolution climatic and geophysical GIS data layers for the world ocean. Seven geophysical variables were derived from the SRTM30_PLUS high resolution bathymetry dataset.  These layers characterize the horizontal orientation (aspect), slope, and curvature of the seafloor and the distance from shore.  Ten "bioclimatic" variables were derived from NOAA's World Ocean Atlas and NASA's MODIS satellite imagery and characterize the inter-annual means, extremes, and variances in sea surface temperature and salinity. These variables will be useful to those interested in the spatial ecology of marine shallow-water and surface-associated pelagic organisms across the globe. Note that, in contrary to the original MARSPEC, all layers have unscaled values.
## 4                                                                                                                                 The ENVIREM dataset is a set of 16 climatic and 2 topographic variables that can be used in modeling species' distributions. The strengths of this dataset include their close ties to ecological processes, and their availability at a global scale, at several spatial resolutions, and for several time periods. The underlying temperature and precipitation data that went into their construction comes from the WorldClim dataset (www.worldclim.org), and the solar radiation data comes from the Consortium for Spatial Information (www.cgiar-csi.org). The data are compatible with and expand the set of variables from WorldClim v1.4 (www.worldclim.org).
## 5                                                      The dataset consists of near-global, spatially continuous, and freshwater-specific environmental variables in a standardized 1km grid. We delineated the sub-catchment for each grid cell along the HydroSHEDS river network and summarized the upstream environment (climate, topography, land cover, surface geology and soil) to each grid cell using various metrics (average, minimum, maximum, range, sum, inverse distance-weighted average and sum). All variables were subsequently averaged across single lakes and reservoirs of the Global lakes and Wetlands Database that are connected to the river network. Monthly climate variables were summarized into 19 long-term climatic variables following the \xd2bioclim\xd3 framework.
##                                                                                                                                                                                                                                                citation
## 1                                                Hijmans, R.J., S.E. Cameron, J.L. Parra, P.G. Jones and A. Jarvis, 2005. Very high resolution interpolated climate surfaces for global land areas. International Journal of Climatology 25: 1965-1978.
## 2 Tyberghein L., Verbruggen H., Pauly K., Troupin C., Mineur F. & De Clerck O. Bio-ORACLE: a global environmental dataset for marine species distribution modeling. Global Ecology and Biogeography. http://dx.doi.org/10.1111/j.1466-8238.2011.00656.x
## 3                                                                                                      Sbrocco, EJ and Barber, PH (2013) MARSPEC: Ocean climate layers for marine spatial ecology. Ecology 94: 979. http://dx.doi.org/10.1890/12-1358.1
## 4                                    Title, P.O., Bemmels, J.B. 2017. ENVIREM: An expanded set of bioclimatic and topographic variables increases flexibility and improves performance of ecological niche modeling. Ecography doi: 10.1111/ecog.02880.
## 5                                              Domisch, S., Amatulli, G., and Jetz, W. (2015) Near-global freshwater-specific environmental variables for biodiversity analyses in 1 km resolution. Scientific Data 2:150073 doi: 10.1038/sdata.2015.73
names(pred_datasets)
## [1] "dataset_code" "terrestrial"  "marine"       "url"          "description" 
## [6] "citation"
# these are the datasets currently available for download using the 'sdmpredictors' package; you can check their URLs for more info on their contents
pred_datasets[ , 1:4]
##   dataset_code terrestrial marine                              url
## 1    WorldClim        TRUE  FALSE        http://www.worldclim.org/
## 2   Bio-ORACLE       FALSE   TRUE           hhtp://bio-oracle.org/
## 3      MARSPEC       FALSE   TRUE              http://marspec.org/
## 4      ENVIREM        TRUE  FALSE       https://envirem.github.io/
## 5   Freshwater        TRUE  FALSE https://www.earthenv.org/streams
pred_layers <- list_layers(datasets = pred_datasets)
unique(pred_layers$dataset_code)
## [1] "WorldClim"  "MARSPEC"    "ENVIREM"    "Freshwater" "Bio-ORACLE"
head(unique(pred_layers[pred_layers$dataset_code == "Bio-ORACLE", ]$name), 10)
##  [1] "Calcite (mean)"           "Cloud fraction (maximum)"
##  [3] "Cloud fraction (mean)"    "Cloud fraction (minimum)"
##  [5] "BO_damax"                 "BO_damean"               
##  [7] "BO_damin"                 "BO_parmax"               
##  [9] "BO_parmean"               "BO_ph"

Selecting layers for download

Selecting layers can be done based on the name or code, which follow general patterns. For example, layers associated with the sea surface end with “_ss” and can be pulled out from the large list of layer codes. From there we can subset the list of all layers with just the information we want.

layer_names <- unique(pred_layers[pred_layers$dataset_code == "Bio-ORACLE", ]$layer_code)
surf <- layer_names[grepl("_ss",layer_names)]
layers_ver2 <- list_layers(datasets = "Bio-ORACLE", version = 2)
surf <- surf[20:25]
layers_choice <- subset(layers_ver2,layer_code %in% surf, select = c("name", "layer_code", "version"))
layers <- rast(load_layers(layers_choice$layer_code, datadir = "./outputs/sdmpredictors"))

Note that this will download the layers if you do not have them already (can take some time) or load them from the specified directory if you do.

So now we have some layers, which we’ve “rasterized” as we downloaded them, but what does this mean? And, more importantly, how do we get the pretty pictures??

layers
## class       : SpatRaster 
## dimensions  : 2160, 4320, 6  (nrow, ncol, nlyr)
## resolution  : 0.08333333, 0.08333333  (x, y)
## extent      : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (with axis order normalized for visualization) 
## sources     : BO2_tempmean_ss_lonlat.zip  
##               BO2_tempmin_ss_lonlat.zip  
##               BO2_temprange_ss_lonlat.zip  
##               ... and 3 more source(s)
## names       : BO2_t~an_ss, BO2_t~in_ss, BO2_t~ge_ss, BO2_c~ax_ss, BO2_c~an_ss, BO2_c~in_ss 
## min values  :   -1.797733,    -1.94000,        0.08,     0.01794,    0.015806,    0.010986 
## max values  :   30.178629,    29.35999,       28.95,    13.41527,    5.394896,    2.521814
head(terra::as.data.frame(layers[[1]], xy= T))
##           x        y BO2_tempmean_ss
## 1 -179.9583 89.95833       -1.716213
## 2 -179.8750 89.95833       -1.716213
## 3 -179.7917 89.95833       -1.716213
## 4 -179.7083 89.95833       -1.716213
## 5 -179.6250 89.95833       -1.716213
## 6 -179.5417 89.95833       -1.716213
raster(layers[[1]])
## class      : RasterLayer 
## dimensions : 2160, 4320, 9331200  (nrow, ncol, ncell)
## resolution : 0.08333333, 0.08333333  (x, y)
## extent     : -180, 180, -90, 90  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : BO2_tempmean_ss_lonlat.zip 
## names      : BO2_tempmean_ss 
## values     : -1.797733, 30.17863  (min, max)
head(as.data.frame(raster(layers[[1]]), xy=T))
##           x        y BO2_tempmean_ss
## 1 -179.9583 89.95833       -1.716213
## 2 -179.8750 89.95833       -1.716213
## 3 -179.7917 89.95833       -1.716213
## 4 -179.7083 89.95833       -1.716213
## 5 -179.6250 89.95833       -1.716213
## 6 -179.5417 89.95833       -1.716213
plot(layers)

Whether working with RasterLayer and RasterStack in the raster package or SpatRaster in the terra package, the most important thing is consistency and the general structure is the same: geometry, value.

While rasters may look quite uniform, or smooth, they are actually made up of pixel grid cells, just like a digital image. The coordinate system and resolution of this grid is important, as it will determine the alignment between your different variables. Try to guess where in the world this is by the marine raster data alone.

You probably didn’t guess Saltstraumen, especially because, according to the raster grid, the point isn’t even in the ocean! If you are working on fine spatial scales, make sure your raster data matches in order to meaningfully address your research questions. For a global study, the outputs from Bio-ORACLE are just fine, but if we were interested in within-fjord oceanographic dynamics we’d need a higher resolution data source.

Once you have your rasters, you can perform basic (and non-basic) math operations on them like any other variable.

primed_layers <- c("BO2_tempmin_ss","BO2_tempmax_ss")
layers_choice <- subset(layers_ver2,layer_code %in% primed_layers, select = c("name", "layer_code", "version"))
primed_layers <- rast(load_layers(layers_choice$layer_code, datadir = "../outputs/sdmpredictors"))
# raster math
t_diff <- primed_layers$BO2_tempmax_ss - primed_layers$BO2_tempmin_ss
names(t_diff) <- 't_diff'
add(primed_layers) <- t_diff

*Note: for these plots I had to use the tidyterra package in order to plot spatrasters within ggplot2, but for other raster and polygon structures, ggplot2 has built in functions such as geom_raster() and geom_sf().

Raster manip: cropping and masking

library(geodata)
countries <- world(path = "../outputs/countries")
countries <- disagg(countries)
include <- c("Norway", "Iceland", "Svalbard and Jan Mayen", "Denmark", "Finland", "Faroe Islands", "Åland", "Sweden", "Greenland")
pres_buff <- aggregate(buffer(subset(countries, countries$NAME_0 %in% include), width = 200000))
plot(pres_buff, col = "lightgrey", border = 'red')
plot(countries, border = "darkgrey", add = TRUE)

layers_cut <- crop(primed_layers, pres_buff, mask = F)
ggplot() +
  geom_spatraster(data = layers_cut, aes()) +
  scale_fill_gradient2(
  low = "green",
  mid = 'blue',
  high = "red",
  midpoint = 10,
  space = "Lab",
  na.value = "grey50",
  guide = "colourbar",
  aesthetics = "fill"
) +
    geom_map(
        data = world, map = world,
        aes(map_id = region),
        color = "black", fill = "lightgray", linewidth = 0.01
      ) +
  labs(title = 'Cropped') +
  facet_wrap(~lyr) +
  theme(plot.title = element_text(hjust = 0.5))
## SpatRaster resampled to ncells = 500400

So cropping will always result in a rectangular raster cropped to the window of our choosing, even if the object we are giving as a reference is not rectangular. If we want a more accurate matchup between our objects we need to “mask” the raster to the reference. This only works if your reference is a raster, polygon, or spat object ie it needs spatial information on the same coordinate system.

layers_cut <- crop(primed_layers, pres_buff, mask = TRUE)
ggplot() +
  geom_spatraster(data = layers_cut, aes()) +
  scale_fill_gradient2(
  low = "green",
  mid = 'blue',
  high = "red",
  midpoint = 10,
  space = "Lab",
  na.value = "grey50",
  guide = "colourbar",
  aesthetics = "fill"
) +
  geom_map(
        data = world, map = world,
        aes(map_id = region),
        color = "black", fill = "lightgray", linewidth = 0.01
      ) +
  labs(title = 'Masked') +
  facet_wrap(~lyr) +
  theme(plot.title = element_text(hjust = 0.5))

To summarize:

Rasters are great for managing data of all types that is bound to a coordinate system. Rasters boil down, for the most part, to a dataframe with xy geometry and a value and can be converted to and from this format. When using rasters, consider the source, as resolutions and coordinate systems can vary greatly, but must be consistent to make meaningful comparisons.

Questions?

Recording available on the marinetics website

More examples:

Output of a species distribution model (favorability) for Tuna under an intense climate change scenario

Overplotting of kelp occurrence from different databases and sea surface temperature

Zonation4 vs. Zonation5 output (courtesy of Tilly), a spatial prioritization analysis that highlights areas with highest modeled biodiversity.